Poster: | gojomo | Date: | Jun 10, 2010 4:06pm |
Forum: | web | Subject: | Re: why not visible ?? |
In the meantime, it's generally not practical to check which sites and pages have been collected but are not yet indexed.
To keep your site archivable, do all the usual things that make it discoverable by web users and web crawlers, and make sure your 'robots.txt', if any, allows the crawlers which feed the archive to collect your site's resources.
These crawlers identify as either 'ia_archiver' or 'archive.org_bot'. (If seeking to block material from being included in the Wayback Machine, a rule against 'ia_archiver' is enough.)
Several robots.txt validators on the net can double-check the format of your robots.txt file.
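If you'd rather check locally, here is a rough sketch using Python's standard urllib.robotparser module. The sample robots.txt, the example.com URL, and the allowed_agents helper are made up for illustration; only the two user-agent names come from the post above. Python's parser may also not interpret every rule exactly the way the archive's crawlers do, so treat it as a sanity check only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keeps /private/ off-limits to crawlers in general,
# but lets the two archive crawlers fetch everything (an empty Disallow
# means "nothing is disallowed").
SAMPLE_ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow:

User-agent: archive.org_bot
Disallow:

User-agent: *
Disallow: /private/
"""


def allowed_agents(robots_txt, url, agents=("ia_archiver", "archive.org_bot")):
    """Return {agent: True/False} for whether each agent may fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


if __name__ == "__main__":
    # example.com is a placeholder; substitute a URL from your own site.
    print(allowed_agents(SAMPLE_ROBOTS_TXT, "http://example.com/private/page.html"))
```

To test the opposite case (keeping material out of the Wayback Machine), swap the sample text for a rule such as "User-agent: ia_archiver" followed by "Disallow: /" and confirm that the check reports False for 'ia_archiver'.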
Hope this helps,
- Gordon @ IA